ABSTRACT

Extreme value statistics (EVS) is applied to the distribution of galaxy luminosities in the Sloan
Digital Sky Survey (SDSS). We analyze the DR8 Main Galaxy Sample (MGS), as well as the Luminous
Red Galaxies (LRG). Maximal luminosities are sampled from batches consisting of elongated pencil
beams along the line of sight. For the MGS, results suggest a small and positive tail index ξ,
effectively ruling out the possibility of having a finite maximum cutoff luminosity, and implying that
the luminosity distribution function may decay as a power law at the high luminosity end. Assuming,
however, ξ = 0, a non-parametric comparison of the maximal luminosities with the Fisher-Tippett-Gumbel
distribution (limit distribution for variables distributed by the Schechter fit) indicates a good
agreement provided uncertainties arising both from the finite batch size and from the batch size
distribution are accounted for. For a volume limited sample of LRGs, results show that they can
be described as being the extremes of a luminosity distribution with an exponentially decaying tail,
provided the uncertainties related to the batch-size distribution are taken care of.

Subject headings: methods: statistical — Galaxies: statistics — galaxies: general — galaxies: lumi- 
nosity function — galaxies: fundamental parameters 



1. INTRODUCTION 

Extreme value statistics is a powerful tool for analyzing the behavior of the
tails of distributions. It is well known that the distribution of extreme values
for a sample of N i.i.d. (independent, identically distributed) random variables
converges (as N → ∞) to one of a few limiting distributions depending on the
tail behavior of the parent population, namely Fisher-Tippett-Gumbel, Weibull
and Fisher-Tippett-Fréchet (Gumbel 1958; Galambos 1978; Embrechts et al. 1997;
Reiss & Thomas 1997; Coles 2001). However, the onset of this finite sample size
scaling behavior is quite slow, so very large samples are required for
convergence. This is the primary reason why astronomy has seen few
applications of EVS to date.

The emergence of dedicated wide angle galaxy surveys, such as SDSS (Stoughton
et al. 2002), has made possible an increase in statistics, making galaxy samples
in the SDSS redshift survey just large enough to attempt an analysis of the
finite sample size scaling for all galaxies. Here we chose to study the
distribution of maximal luminosities of galaxies, since the galaxy luminosity
distribution per volume, or luminosity function (LF), is one of the most basic
statistics measured in galaxy surveys. This function has been well described
by a gamma distribution or so-called Schechter function (Schechter 1976),
functionally similar to (and motivated by) the theoretically derived
Press-Schechter formula (Press & Schechter 1974), with a power law distribution
at the faint end and an exponentially falling tail at the bright end. When
galaxies are grouped according to their morphologies, their respective LFs seem
to belong to different classes, including bell-shaped distributions as well as
gamma functions of different shape and scale parameters (Binggeli, Sandage &
Tammann 1988). Current modeling of the conditional LF (CLF) of galaxy clusters
in dark matter halos of a certain mass includes the presence of central or
brightest cluster galaxies (BCGs) with a log-normal CLF, while the rest of the
galaxies (satellites) are given a power law CLF with a finite cut at high
luminosities (Cooray & Milosavljević 2005; Cooray 2006).

Special attention has always been paid to the high luminosity tail of the
all-galaxy LF. BCGs, the brightest of the old populations of red elliptical
galaxies found in the high density cores of galaxy clusters, are thought to
have had their progenitors formed at high redshift (z > 3), and then to have
undergone a set of dry mergers in their life history (e.g. Ostriker & Hausman
1977; De Lucia & Blaizot 2007). Their importance lies in the low scatter of
their luminosities, making them useful as standard candles (Postman & Lauer
1995; Loh & Strauss 2006; Lin et al. 2010; Paranjape & Sheth 2011; Dobos &
Csabai 2011).



Several studies have been made to elucidate whether BCGs are the extremes of a
red or early type galaxy LF or whether they come from another luminosity
distribution. In order to answer the question, Geller & Tremaine (1976),
Tremaine & Richstone (1977) and Bhavsar & Barrow (1985) investigated the
statistics derived from the first and second brightest luminosities (and the
gap between them) in galaxy clusters. Their results, based on smaller samples,
have been confirmed by Loh & Strauss (2006), who found that the luminosity gap
between first and second-ranked galaxies is substantially larger than what can
be explained with an exponentially decaying luminosity function. On the other
hand, Lin et al. (2010) shuffled the data to combine all galaxies of clusters
to form a composite cluster, finding that BCGs in high luminosity clusters are
not drawn from the luminosity distribution of all red cluster galaxies, while
BCGs in less luminous clusters are consistent with being the statistical
extreme.

These previous studies were mainly directed toward the 
luminosity statistics within galaxy clusters. In this pa- 
per we will study galaxy luminosities as a whole, and the 
sampling will not be restricted to the maximal luminosi- 
ties from galaxy clusters. This will keep the sample size 
large enough for studying the finite-size scaling behavior. 

Since EVS is well known only for i.i.d. variables, one 
approach we will follow is trying to minimize the correla- 
tions between luminosities and positions by selecting the 
maximal luminosities from batches or blocks of galaxies 
in elongated regions or pencil beams along the line of 
sight, and defined by the footprint of the HEALPix tessellation
on the sky (Górski et al. 2005). As we shall discuss,
such elongated cells combined with the short range
correlations in luminosities make possible an analysis of 
EVS based on the assumption that the luminosities ap- 
proximate well an i.i.d. behavior. This approach allows 
us to show a working example designed to mimic the 
standard block maxima sampling method from EVS of 
time series, but also generalizing it to the case of variable 
block size, as discussed later in this paper. Another sim- 
pler approach we will use for comparing with the previous 
method is the random sampling of the luminosity parent 
distribution in batches of fixed size. These are new ap- 
proaches for testing the bright end of the overall LF, and 
inherently different from previous studies that considered 
testing the luminosity extremes in galaxy clusters. 

Within the i.i.d. framework, the shape of the galaxy 
luminosity function is important for the EVS. The expo- 
nential tail in the high luminosity end of the LF would 
imply a Fisher-Tippett-Gumbel (FTG) EVS distribution,
with corrections for the finite sample sizes depending
on the power law at lower luminosities (Györgyi et al.
2008). In this analysis we will test the agreement with
these expectations, and the analysis will also reveal 
whether or not a sharp cutoff at a high but finite lu- 
minosity exists. 

We emphasize that even though the SDSS sample is 
large, the residual from the FTG distribution can be ex- 
plained only when we consider the corrections due to 
both the finite size of the samples of each HEALPix pen- 
cil beam and the distribution present in the sample sizes 
(the number of galaxies in a cone is finite and varies from 
cone to cone). Thus we have here a pioneering example 
where a generalized finite size scaling (including sample- 
size distribution) is relevant in the data analysis. 

The arguments and results will be presented in the following order. In
Section 2 we describe our galaxy sample. Section 3 shows the fits to the galaxy
luminosity distributions and functions. Section 4 explains the construction
of the pencil beams and the distribution of galaxy counts inside them.
Section 5 contains a discussion of the basic concepts of extreme value
statistics, with emphasis on possible deviations from the expected limit
distributions due to the finite number of galaxies in the pencil beams and,
furthermore, due to the pencil-to-pencil fluctuations in the galaxy counts. In
Section 6 we present the results on the distribution of maximal luminosities,
with the conclusion that, within the uncertainties coming from the finiteness
of samples and from the sample-size distribution, the Fisher-Tippett-Gumbel
distribution gives an excellent fit. The final remarks and discussion can be
found in Section 7.

Throughout this paper, we use the (h, Ω_m, Ω_Λ, w_0) = (0.7, 0.3, 0.7, -1)
cosmology.

2. SAMPLE CREATION 

In this paper we use photometric and spectroscopic data of galaxies from
SDSS-DR8 (York et al. 2000; Stoughton et al. 2002; Aihara et al. 2011),
available in a MS-SQL Server database that can be queried online via
CasJobs^1, and analyzed directly inside the database using an integrated
cosmological functions library (Taghizadeh-Popp 2010). The galaxies studied
were the DR7 legacy spectroscopically-targeted Main Galaxy Sample (MGS)
(Strauss et al. 2002), as well as the luminous red galaxies (LRGs) (Eisenstein
et al. 2001). The sky footprint of the clean spectroscopic survey is built up
from a complicated geometry defined by sectors, which cover a fractional area
F_A = 0.1923 of the whole sky. Redshift incompleteness arises from the fact
that two 3" aperture spectroscopic fibers cannot be placed closer than 55" to
each other on the same plate. As a strategy, denser regions of the sky are
given a greater number of overlapping plates. However, only ~93% (MGS) and
~95% (LRG) of the initial photometrically targeted galaxies have their spectra
taken.

Several selection cuts and flags were applied in order to have a clean sample.
We selected only science primary objects classified as galaxies and appearing
in calibrated images having the photometric status flag. We used the score
quantity as a measure of the field quality with respect to the sky flux and
the width of the point spread function, and selected only the fields in the
range 0.6 < score < 1.0. Furthermore, we rejected individual objects with bad
deblending flags (PEAKCENTER, DEBLEND_NOPEAK, NOTCHECKED), interpolation
problems (PSF_FLUX_INTERP, BAD_COUNTS_ERROR) or suspicious detections
(SATURATED, NOPROFILE), as well as objects with problems in the spectrum
(ZWARNING)^2.

The MGS galaxies were observed as a magnitude limited sample, with a targeted
r-band Petrosian apparent magnitude cut of m_r < 17.77, and a redshift
distribution peaking at z ≈ 0.1. We further restrict this sample to safe cuts
of [m_{r,1}, m_{r,2}] = [13.5, 17.65]. The lower limit is set because of the
cross talk that arises between close fibers in the spectrographs when they
carry light from very bright galaxies, whereas the upper limit safely avoids
the slight variations over the sky of the limiting apparent magnitude around
17.77 in the targeting algorithm. As shown in Fig. 1, we chose galaxies in the
redshift interval [z_1, z_2] = [0.065, 0.22], since at redshifts lower than
z_1 the galaxy high luminosity tail becomes incomplete (due to imposing the
apparent magnitude cut at m_{r,1}). This left us with N_g = 348975 MGS galaxies
in a volume of V_S = [V(z_2) - V(z_1)] × F_A = 0.559 Gpc^3.
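
As a cross-check of the quoted survey volume, the comoving volume of a redshift shell can be sketched numerically. The snippet below is only a minimal illustration, assuming a flat ΛCDM model with the cosmology adopted in this paper; the function names and the Simpson integration are illustrative choices and are not the cosmological functions library used inside the database.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]


def comoving_distance_mpc(z, h=0.7, omega_m=0.3, omega_l=0.7, steps=1000):
    """Line-of-sight comoving distance [Mpc] for flat LCDM with w = -1."""
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)

    # Composite Simpson rule on [0, z]; `steps` must be even.
    dz = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * dz)
    return (C_KM_S / (100.0 * h)) * total * dz / 3.0


def survey_volume_gpc3(z1, z2, sky_fraction):
    """Comoving volume [Gpc^3] between z1 and z2 over a fraction of the sky."""
    d1 = comoving_distance_mpc(z1)
    d2 = comoving_distance_mpc(z2)
    shell_mpc3 = 4.0 * math.pi / 3.0 * (d2 ** 3 - d1 ** 3)  # full-sky shell
    return shell_mpc3 * sky_fraction / 1.0e9


# MGS redshift window and footprint fraction F_A taken from the text.
print(survey_volume_gpc3(0.065, 0.22, 0.1923))
```

With the MGS window [z_1, z_2] = [0.065, 0.22] and F_A = 0.1923, this sketch should give a value close to the 0.559 Gpc^3 quoted above.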
With respect to the LRGs, they were selected from
color cuts (in g-r vs. r-i space) in such a way that
they are traced across redshift as an old population of
luminous and passively evolving red early type galaxies
(Eisenstein et al. 2001). This was done by modeling
^1 http://casjobs.sdss.org
^2 Detailed explanation in sdss3.org/dr8/algorithms
Figure 1. Completeness limits for the raw galaxy samples. Plotted is the
Petrosian absolute magnitude vs. redshift histogram (log-scaled) for the full
MGS and LRG samples. No k-correction or evolution correction is applied at
this point. The red horizontal lines show the absolute magnitude limits for a
complete sample (-22.53 and -22.35 for MGS and LRG respectively). Note that
for redshifts greater than z = 0.065 (vertical red line), the MGS becomes
complete at the bright end. This redshift limit was also checked using the
Vmax method explained in Sec. 3.

them with an old stellar population spectral template from PEGASE (Fioc &
Rocca-Volmerange 1997). We use LRGs in the CUT I sample, which was built to be
almost a volume limited sample up to redshift 0.38 with an r-band Petrosian
apparent magnitude cut of m_{r,2} = 19.2. We apply a safe redshift window of
[z_1, z_2] = [0.20, 0.38] since, at lower redshift, the color selection cuts
admit blue galaxies belonging to the MGS. We further constrain these LRGs by
considering galaxies whose r-band surface light profile is modeled better by a
de Vaucouleurs profile (as in elliptical galaxies) than by an exponential disc
(fracDeV > 0.9) (Strateva et al. 2001). We also use the r-band concentration
index R90/R50 (Shimasaku et al. 2001; Strateva et al. 2001) to select mostly
elliptical galaxies (R90/R50 > 2.7). This left us with N_g = 52579 LRGs in a
volume of V_S = 2.18 Gpc^3.

Since our samples span broad redshift and time intervals, it is crucial to
apply a (k+evolution)-correction to M_r in the form
M_r = m_r - DM(z) - k(z) - e(z), which brings all the galaxies to a common
z = 0 rest-frame. The k-corrections for the MGS were calculated by modeling
each galaxy spectrum as the closest non-negative linear combination of spectra
drawn from the Bruzual & Charlot (2003) templates (see Budavári et al. (2000)
and Csabai et al. (2000)). We applied a simple average evolution correction as
a linear function of redshift, derived by Blanton et al. (2003) as e(z) = -Qz
(Q = 1.62 for r-band, Q = 4.22 for u-band). For the LRG case, we used the
k+evolution correction derived from the PEGASE template. This was modeled as a
4th order polynomial in redshift, as used in Loh (2004) and Loh & Strauss
(2006),
where k(z)-|-e(z)=0.115z +bMz'^-2A.Qz'^+m.Qz'^ . 

We finally inspected the images of the first 1000 galaxies of each sample,
ranked by brightest r-band Petrosian absolute magnitude, and rejected the
objects whose photometry appears to be ruined by the leaked light of a nearby
star. Objects were also rejected when the Petrosian magnitude differed by more
than 0.8 magnitudes from the model magnitude.

3. LUMINOSITY FUNCTIONS AND DISTRIBUTIONS 

The luminosity function (LF), defined as the distribution of galaxy
luminosities (or magnitudes) per volume, has long been studied as a basic
statistic. Since galaxy surveys are generally apparent magnitude limited at
the faint end, the LF differs from the luminosity distribution (LD) in that
the former cannot be obtained from a simple raw histogram of the luminosity
data points, as LDs are. In fact, the faint luminosity tail of LDs is
incomplete, as faint galaxies can be observed only at close enough distances
(Malmquist bias). In contrast, the brightest galaxies can generally be
observed over the whole redshift range of the survey. As a consequence, LFs
are identical to LDs at the bright end (except for a scale factor equal to the
survey's volume V_S) but start to depart from each other at a departure
luminosity L_D (specified next).

The important link between LFs and LDs is that, since 
they behave the same way at the bright end, we can 
study LFs in this regime by instead doing the sampling 
and EVS on the LD of the individual data points. This is 
the strategy followed in this paper, which works as long 
as we sample galaxies with luminosities close enough to 
or brighter than L_D.
Figure 2. 2-dimensional histogram (in logarithmic gray-scale) showing the
onset of absolute magnitude incompleteness. For the LRGs (left branch), we see
w_i > 1 and w_i > 2 starting at M_r = -22.63 and M_r = -22.05, respectively.
For the MGS (right branch), it happens at M_r = -22.43 and -21.91,
respectively. The curves against the vertical axis are the corresponding
(unnormalized) distributions of the weights w_i.



In order to correct for the incompleteness of low luminosity galaxies, we
construct LFs by adding more weight to these galaxies, as in the Vmax method
(Schmidt 1968), where the i-th galaxy is assigned a weight
w_i = V_S/V_{M,i} ≥ 1. Here we note that, given the particular [z_1, z_2] and
[m_1, m_2] intervals for the survey, the i-th galaxy found at z_i could be
observed only within a maximum comoving volume V_{M,i} inside the overall
volume V_S of the survey. If the i-th galaxy of apparent magnitude m_i,
k-correction k_i = k(z_i), evolution correction e_i = e(z_i) and luminosity
distance D_L(z_i) were to have the limiting apparent magnitudes m_{1,2}, then
it would have to be moved to a limiting luminosity distance given by

D_{L,i}(z_{lim}; m_{1,2}) = D_L(z_i) \times 10^{[m_{1,2} - k(z_{lim}) - e(z_{lim}) - m_i + k_i + e_i]/5}.   (1)

Hence, the maximum volume is defined by the largest distance interval inside
which a galaxy can appear in the survey:

V_{M,i} = [V(\min(D_L(z_2), D_{L,i}(z_{lim}; m_2))) - V(\max(D_L(z_1), D_{L,i}(z_{lim}; m_1)))] \times F_A.   (2)

Figure 3. Luminosity functions and distributions of the different galaxy
samples, together with the best Schechter, generalized gamma, Gumbel and EVD
fits from Tables 1 and 2. The vertical lines denote the completeness boundary
w_i = 1, where the LFs depart from the LDs. The LDs were scaled by the factor
N_g/V_S in order to compare them with the LFs.

As Eq. 1 defines z_lim in an implicit way, we solve for it iteratively. The
weights V_S/V_{M,i} are shown in Fig. 2. The departure magnitudes separating
the complete (w_i = 1) and incomplete (w_i > 1) parts of the samples take the
values of M_D = -22.63 (LRG) and M_D = -22.43 (MGS). Thus, the distribution of
weights looks bimodal, where the complete part of the sample creates the spike
at w_i = 1, and the incomplete part forms the broad tail. Note that the
incomplete part initially follows a nearly linear trend given by
log w ~ (3/5) M_r (derived from Eq. 1). The part of the trend that departs
from it and extends into the high log w region, on the other hand, is composed
of galaxies whose apparent magnitude is very close to the limiting apparent
magnitude cut m_2 of the survey.
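
The iterative solution of Eq. 1 can be sketched as a fixed-point iteration on z_lim. The code below is a minimal illustration under stated assumptions: a flat ΛCDM luminosity distance, a bisection inversion of D_L(z), and user-supplied callables `k` and `e` for the k- and evolution corrections. All function names are hypothetical and are not taken from the analysis pipeline of this paper.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]


def lum_distance(z, h=0.7, omega_m=0.3, omega_l=0.7, steps=200):
    """Luminosity distance [Mpc] for flat LCDM (composite Simpson rule)."""
    if z <= 0.0:
        return 0.0

    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)

    dz = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * dz)
    return (1.0 + z) * (C_KM_S / (100.0 * h)) * total * dz / 3.0


def redshift_at(d_l, z_max=5.0):
    """Invert lum_distance by bisection (D_L grows monotonically with z)."""
    lo, hi = 0.0, z_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lum_distance(mid) < d_l:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def limiting_redshift(m_i, z_i, m_lim, k, e, tol=1e-9, max_iter=100):
    """Solve Eq. 1 for z_lim by fixed-point iteration, starting from z_i."""
    d_i = lum_distance(z_i)
    z_lim = z_i
    for _ in range(max_iter):
        # Exponent of Eq. 1 evaluated at the current guess for z_lim.
        exponent = (m_lim - k(z_lim) - e(z_lim) - m_i + k(z_i) + e(z_i)) / 5.0
        z_new = redshift_at(d_i * 10.0 ** exponent)
        if abs(z_new - z_lim) < tol:
            return z_new
        z_lim = z_new
    return z_lim
```

For example, `limiting_redshift(17.0, 0.1, 17.65, lambda z: 0.0, lambda z: -1.62 * z)` returns the redshift out to which a galaxy observed at z = 0.1 with m = 17.0 would still satisfy the faint cut, using the linear evolution correction of the MGS and, for simplicity, no k-correction. The resulting distance then enters Eq. 2.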

A non-parametric LF can be then easily estimated us- 
ing a Vmax weighted histogram in the form 

The extra weight c_i takes into account the incompleteness
of the target selection algorithm for spectroscopic
follow up. The error δΦ(M_r) is estimated by using jackknife
sampling of 38 regions of about 200 deg^2 each.
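
A Vmax weighted histogram of this kind can be sketched as follows. This is a minimal illustration, assuming the usual estimator in which each galaxy adds c_i/V_{M,i} = c_i w_i/V_S to its magnitude bin and the sum is divided by the bin width; the function and argument names are hypothetical.

```python
def vmax_luminosity_function(mags, vmax, completeness, m_edges):
    """Vmax-weighted LF estimate per unit volume and per unit magnitude.

    mags         -- absolute magnitudes M_r of the galaxies
    vmax         -- maximum volumes V_{M,i} (same length as mags)
    completeness -- extra weights c_i for spectroscopic incompleteness
    m_edges      -- increasing bin edges in absolute magnitude
    """
    nbins = len(m_edges) - 1
    phi = [0.0] * nbins
    for m, v, c in zip(mags, vmax, completeness):
        for j in range(nbins):
            if m_edges[j] <= m < m_edges[j + 1]:
                phi[j] += c / v  # each galaxy contributes c_i / V_{M,i}
                break
    # Divide by the bin width to obtain a density per magnitude.
    return [p / (m_edges[j + 1] - m_edges[j]) for j, p in enumerate(phi)]
```

Galaxies falling outside the outermost bin edges are simply ignored in this sketch. A jackknife error could be obtained by repeating the call with one sky region left out at a time.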

The luminosity functions and distributions and their
fits are shown in Fig. 3, and Tables 1 and 2 contain the
fitting parameters.

For the MGS samples, we use a generalized gamma as a fitting function, which
results in the known Schechter profile (Schechter 1976) when β = 1:

\Phi(L)\,dL = \Phi_* (L/L_*)^{\alpha} \exp[-(L/L_*)^{\beta}]\,dL.   (4)

The normalization factor Φ_* is left as a free parameter for LFs, whereas for
the LDs it is defined by Φ_* = β/[L_* Γ((α+1)/β, (L_min/L_*)^β)], where the
incomplete gamma function Γ(·, (L_min/L_*)^β) is the integral over the
interval [(L_min/L_*)^β, ∞]. The fitted luminosity values for the LFs and LDs
were converted into magnitude units using
M_r = -2.5 log_10[L/L_⊙] + M_{⊙,r}, where M_{⊙,r} = 4.62



Table 1

Luminosity Function Fitting parameters.^a

Sample   Φ_* [10^-3 Mpc^-3 Mag^-1]   M_*              α               β
-------------------------------------------------------------------------------
Schechter Fitting
MGS      3.10 ± 0.05                 -21.46 ± 0.02    -1.34 ± 0.04    1
General Gamma Fitting
MGS      7.79 ± 0.38                 -20.42 ± 0.10    -0.81 ± 0.05    0.75 ± 0.02

Sample   Φ_* [10^-5 Mpc^-3]          M_μ              M_σ             ξ
-------------------------------------------------------------------------------
Gumbel Fitting
LRG      2.52 ± 0.03                 -22.85 ± 0.01    -21.36 ± 0.02
GEV Fitting
LRG      2.49 ± 0.02                 -22.86 ± 0.01    -21.36 ± 0.02   0.04 ± 0.01

^a Parameters from fitting to Eq. 4 (MGS) or Eq. 5 (LRG).
Parameters in luminosity units (L_*, L_μ and L_σ) were converted
into absolute magnitudes (M_*, M_μ and M_σ) using
M = -2.5 log_10[L/L_⊙] + M_⊙, where M_⊙ = 4.62, everything
measured in the Petrosian r-band. MGS and LRG samples
are fitted in the ranges M_r < -20.2 and M_r < -22.64,
respectively.

(Blanton et al. 2001) and L_⊙ is the solar luminosity in
the r band.

From Figure 3 we can see that the generalized gamma function provides a better
fit to the MGS LF than the Schechter function. Indeed, β = 0.75 fits the high
luminosity tail much better, and is similar to the value found by Bernardi et
al. (2010) (β = 0.698). Our faint end slope (α = -0.81) is steeper than their
value (α = -0.45), although we are fitting in a different magnitude interval
and to a galaxy sample with different magnitude and redshift selection cuts.
The errors in the magnitude have little influence on the value of the fitted
parameters. As shown in Bernardi et al. (2010), bigger magnitude errors might
decrease the fitted value of β. However, they showed that the inclusion of the
< 0.05 rms errors on the magnitude in SDSS produced discrepancies in the
fitting parameters generally smaller than their statistical errors.

The LRG sample was made to include the brightest early types. Therefore, they
are naturally better fitted with an extreme value distribution (EVD) or its
special case, the Gumbel distribution, where ξ = 0 (see Sec. 5):

\Phi(L) = \Phi_* \, EVD(L),   (5)
EVD(L)\,dL = L_\sigma^{-1} \, t(L)^{1+\xi} \exp(-t(L))\,dL,   (6)
t(L) = [1 + \xi (L - L_\mu)/L_\sigma]^{-1/\xi},   1 + \xi (L - L_\mu)/L_\sigma > 0,

with L_μ, L_σ and ξ being respectively the location, scale and shape (or tail
index) fitting parameters. The LRGs were built to be a complete (volume
limited) sample, but some scattered lower luminosity galaxies passed the color
cuts and contaminated it (Fig. 1). Therefore, we only fit the LF up to the
completeness limit M_D = -22.64, as explained earlier.

4. SAMPLING THE MAXIMAL LUMINOSITIES: CREATION
OF I.I.D. BATCHES AND HEALPIX-BASED PENCIL
BEAMS

Classic extreme value statistics needs close-to-i.i.d. realizations
of the underlying parent probability distribution
from which to draw the maximal values.
Table 2

Luminosity Distribution Fitting parameters.^a

Sample   M_*              α              β
---------------------------------------------------
General Gamma Fitting
MGS      -19.99 ± 0.16    1.52 ± 0.10    0.79 ± 0.03

^a Parameters from fitting to Eq. 4 (MGS) using
Φ_* = β/(L_* Γ((α+1)/β, (L_min/L_*)^β)). We used M_max =
M(L_min) = -20.2. Parameters in luminosity units were
converted into absolute magnitudes.



In the EVS of time series, a common practice is to use the block maxima
approach, where the (possibly correlated) data set is grouped into disjoint
and temporally consecutive blocks or batches of the same size from which to
choose the extreme values (e.g. annual maxima) (Embrechts et al. 1997; Reiss &
Thomas 1997; Coles 2001). The blocks can be chosen to span each of the many
different cycles of the underlying process, generally creating n blocks, with
all the blocks having the same number N (batch size) of data points. Thus, a
first simplified sampling strategy in this paper is the one where the disjoint
batches are chosen by random sampling without replacement of the luminosity
values.
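
The fixed-size random batching described above can be sketched in a few lines. This is a minimal illustration, assuming the luminosities are held in a plain list; the function name and the seeding convention are hypothetical.

```python
import random


def block_maxima(values, batch_size, seed=0):
    """Maxima of disjoint random batches of equal size.

    The values are shuffled once (sampling without replacement) and cut
    into consecutive batches; leftover values that do not fill a complete
    batch are discarded.
    """
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_blocks = len(shuffled) // batch_size
    return [
        max(shuffled[i * batch_size:(i + 1) * batch_size])
        for i in range(n_blocks)
    ]
```

Because the batches are disjoint, each luminosity can be the maximum of at most one batch, and the largest luminosity of the sample always appears among the maxima as long as it is not part of the discarded remainder.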

In real life situations, however, some blocks might 
present missing or sparse data, due to bad sampling 
strategies or sensor failures. In other cases, the data 
points clump in clusters of different sizes that exceed a 
certain threshold level (e.g., insurance claims after a hurricane).
In all these situations, the different realizations
of the parent probability distribution will have a different
number N of (possibly correlated) data points, with
distribution P(N).

In order to show a real working equivalent of the previous time series
process, the second sampling strategy generalizes the block maxima approach by
extending it to the case of variable block size and weak enough correlations
(the meaning of weak is discussed in Section 7). To this aim, we recreate this
situation by dividing the sky into equal-area patches, each one defined by an
individual cell of the HEALPix tessellation (Górski et al. 2005). This creates
1-dimensional pencil-like beams, each of which contains one close-to-i.i.d.
realization of the galaxy distribution through redshift, with a variable
galaxy number N. We start with a finer SDSS DR8 spectroscopic footprint with
resolution N_side = 512 (of cell size ~ 6.87'). We further degrade the
footprint into 3 lower resolution maps defined by N_side = 16, 32 and 64, thus
creating the cells that define the pencil beams. Note that these bigger cells
may partly cover an area not belonging to the footprint. Hence, we define the
fractional area occupancy f as the area inside the footprint covered by the
cell divided by the total area of the cell. The cumulative distribution of f
(Fig. 5) shows clear breakpoints at f ≈ 0.97 for all 3 resolutions. We
therefore decide to use only the group of cells which satisfy f > 0.97. A
summary of the 3 different resolution HEALPix schemas is presented in Table 3,
and HEALPix maximal luminosity maps are shown for the MGS in Fig. 4.
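
The cell counts and cell sizes quoted for each resolution follow from the HEALPix geometry, in which the sphere is divided into 12 N_side^2 equal-area cells. The sketch below illustrates this arithmetic together with the occupancy cut; it uses only the standard library, and the function names are hypothetical (a real analysis would more likely rely on a HEALPix package).

```python
import math

FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253 deg^2


def healpix_geometry(nside):
    """Return (number of cells on the sphere, cell size in degrees).

    The cell size is defined here as the square root of the cell area.
    """
    n_sphere = 12 * nside ** 2
    cell_size_deg = math.sqrt(FULL_SKY_DEG2 / n_sphere)
    return n_sphere, cell_size_deg


def select_cells(occupancy, threshold=0.97):
    """Indices of cells whose fractional area occupancy f exceeds the cut."""
    return [i for i, f in enumerate(occupancy) if f > threshold]


for nside in (16, 32, 64, 512):
    n_sphere, size = healpix_geometry(nside)
    print(nside, n_sphere, round(size, 2), "deg =", round(size * 60.0, 2), "arcmin")
```

Under this definition of cell size, the N_side = 512 cells come out at about 6.87', in line with the value given above.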
Figure 4. HEALPix maps of maximal luminosities (in linear scale) for the MGS
galaxy sample at different values of N_side. Darker color means higher
luminosity. Only cells with at least 97% of their area inside the footprint
are shown. The SDSS-DR8 footprint boundaries become evident at resolution
N_side = 64.

Figure 5. Cumulative distribution of the HEALPix cell fractional occupation
area f, shown for the 3 different footprint cell sizes. The black filled dots
show the breakpoints at f = 0.97. We consider only cells with f > 0.97.

4.1. Distributions of galaxy counts in a HEALPix cell 

The number of galaxies N fluctuates in the pencil beams. Unless the
galaxy-count distribution in a HEALPix cell, F(N), is very narrow, this
affects the limit distribution one expects for the extreme luminosities. We
have thus evaluated F(N) for the MGS and LRG samples, and the results for
decreasing N_side, i.e. for increasing average galaxy count ⟨N⟩, are shown in
Fig. 6.

As one can see, the N_side = 64 case suggests good statistics
Figure 6. Galaxy count distributions in a HEALPix cell of size N_side = 64, 32, and 16, i.e. for increasing average galaxy count ⟨N⟩.
Gamma function of the form J^{N) ~ {N/N — d)^ e-^pl— N / N + d) fits well the distributions as can be seen and, to a good approximation, 
count distributions for different galaxy samples can be scaled together by the average number of galaxies in a cell ⟨N⟩ (second row).
Note that equally good fits can also be provided by negative binomial distributions used in previous studies, e.g. Croton et al. (2007),
Yang & Saslaw (2011) and references therein.



Table 3

HEALPix schemas for the different galaxy samples^a

N_side   Ω_pix^(1/2)   n_sphere   n_F (MGS)   n_F (LRG)   n_F,97 (MGS)   n_F,97 (LRG)
---------------------------------------------------------------------------------------
16       3.66°         3072       768         755         473            473
32       1.83°         12288      2659        2591        2030           2029
64       55.0'         49152      10017       9461        8492           8256

^a Here, n_sphere is the total number of HEALPix cells in the
sky (each of area Ω_pix), n_F is the number of galaxy-containing
cells inside the footprint and n_F,97 is the number of cells with
areas included at least 97% inside the footprint.

yielding smooth functions. We should note, however,
that the average galaxy count is rather low in this case
(⟨N⟩ ≈ 5-40), so it is far from the limit ⟨N⟩ → ∞ one
would like to take when investigating the EVS.

There are two lessons to learn from Fig. 6. First, the
distributions are rather narrow which suggests that it is 
a reasonable assumption that the theory of EVS known 
for fixed N can be applied to galaxy luminosity. Second, 
one can develop analytic approximations to these his- 
tograms. Indeed, the distributions can be relatively well 
approximated by a gamma function with free location 
and scale parameters. 

5. THEORY OF EXTREME VALUE STATISTICS 

5.1. Classical Theory 

Extreme value statistics (EVS) is concerned with the probability, P_N(v)dv, of
the largest value in a batch of N measurements {v_1, v_2, ..., v_N} being
v = max_i v_i. For us, the v_i's are galaxy luminosities, obtained either by
randomly sampling N galaxies from the sky, or chosen from the HEALPix cells
covering the sky, each with a variable N.

The results of the EVS are simple provided the v_i's are i.i.d. variables
drawn from a general parent distribution f(v_i). Namely, the limit
distribution P_{N→∞}(v) belongs to one of three types and the determining
factor is the large-argument tail of the parent distribution (Gumbel 1958;
Galambos 1978). A Fréchet type distribution emerges if f decays as a power
law, the Fisher-Tippett-Gumbel (FTG) distribution is generated by f's which
decay faster than any power law and, finally, parent distributions with a
finite cutoff and power law behavior around the cutoff yield the Weibull
distribution (Gumbel 1958; Galambos 1978). All the above cases can be unified
as a generalized EVD whose integrated distribution F_N(v) is given in the
N → ∞ limit by



F(v) = \exp\left\{ -\left[ 1 + \xi \frac{v - \mu}{s} \right]^{-1/\xi} \right\},   (7)

with parameters μ, s and ξ. The shape parameter ξ can take values ξ > 0,
ξ = 0, ξ < 0, which correspond to the Fréchet, FTG, Weibull classes,
respectively. The parameter ξ is also called the tail index, since it is
related to the exponent of the large-argument power-law behavior. The
probability density function associated with Eq. (7) is shown in Eq. (6).
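
The three classes unified by Eq. (7) can be explored with a short numerical sketch. The function below is an illustration only (its name and argument order are hypothetical); it evaluates the generalized extreme value distribution function, treats ξ = 0 as the Gumbel limit exp(-exp(-(v - μ)/s)), and returns 0 or 1 outside the support where 1 + ξ(v - μ)/s ≤ 0.

```python
import math


def gev_cdf(v, mu=0.0, s=1.0, xi=0.0):
    """Integrated generalized extreme value distribution F(v) of Eq. (7)."""
    z = (v - mu) / s
    if abs(xi) < 1e-12:
        # FTG (Gumbel) class, the xi -> 0 limit of Eq. (7).
        return math.exp(-math.exp(-z))
    arg = 1.0 + xi * z
    if arg <= 0.0:
        # Outside the support: below the lower bound for xi > 0 (Frechet),
        # above the finite cutoff for xi < 0 (Weibull).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-arg ** (-1.0 / xi))
```

For ξ < 0 the function reaches exactly 1 at the finite cutoff v = μ - s/ξ, which is how a sharp maximum luminosity would show up, while a small positive ξ gives a slowly decaying power-law tail.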

The EVS has been developed mainly for i.i.d. variables and there are only a
few well established results for systems with correlations between the v_i's.
These results are mainly related to sufficiently weakly correlated variables,
where the i.i.d. results can be shown to apply (Berman 1964; Györgyi et al.
2007). In the following we shall assume that the correlations between the
galaxy luminosities are sufficiently weak so that the experimental histograms
can be compared with the i.i.d. results. This assumption is important for the
sampling in HEALPix cells in the sky, but not in a random sampling schema
(arguments in favor of this assumption will be discussed in Section 7 using
the knowledge of the correlations between galaxy positions).

The parent distribution for galaxy luminosities is known to be well fitted by
the Gamma-Schechter distribution Φ(L) = Φ_*(L/L_*)^α exp[-(L/L_*)^β] as given
in Eq. (4), where L_* sets the scale, and α ≈ -1 together with β = 1 is the
Schechter profile. For this parent distribution, the theory of EVS tells us
that the limit distribution of extremal luminosities belongs to the FTG class
(ξ = 0)



P(v) = \frac{dF(v)}{dv} = a \exp\left[ -(a v + b) - e^{-(a v + b)} \right],   (8)


where the parameters can be fixed by setting $\langle v\rangle = 0$
and $\sigma^2 = \langle v^2\rangle - \langle v\rangle^2 = 1$, yielding
$a = \pi/\sqrt{6}$ and $b = \gamma_E \approx 0.577$. It should be
emphasized that this choice leads to a parameter-free comparison
with the empirical data. In fact, the histogram of the maximal
luminosities $P_N(v)$ should be plotted in terms of the variable
$x = (v - \langle v\rangle_N)/\sigma_N$, where $\langle v\rangle_N$ is
the average of the maximal luminosity while
$\sigma_N = \sqrt{\langle v^2\rangle_N - \langle v\rangle_N^2}$ is its
standard deviation. The resulting scaling function should approach
the universal function in the $N \to \infty$ limit

$$P_N(x) = \sigma_N P_N(\sigma_N x + \langle v\rangle_N) \to P(x) . \tag{9}$$

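The rescaling behind Eq. (9) can be sketched in a few lines. This is an illustration only: the parent is a unit exponential, chosen here as a hypothetical stand-in for an exponentially decaying luminosity tail, and the names `ftg_pdf`, `ftg_cdf` and `standardize` are ours, not the paper's.

```python
import math
import random
import statistics

A = math.pi / math.sqrt(6.0)   # a in Eq. (8)
B = 0.5772156649015329         # Euler's constant, b in Eq. (8)

def ftg_pdf(x):
    """Zero-mean, unit-variance FTG density of Eq. (8)."""
    y = A * x + B
    return A * math.exp(-y - math.exp(-y))

def ftg_cdf(x):
    """Integrated form of ftg_pdf."""
    return math.exp(-math.exp(-(A * x + B)))

def standardize(maxima):
    """Map batch maxima v to x = (v - <v>_N) / sigma_N."""
    mean = statistics.fmean(maxima)
    sigma = statistics.pstdev(maxima)
    return [(v - mean) / sigma for v in maxima]

rng = random.Random(0)
n_batches, batch_size = 4000, 100
# Assumed parent: unit exponential (illustrative, not the SDSS data)
maxima = [max(rng.expovariate(1.0) for _ in range(batch_size))
          for _ in range(n_batches)]
x = standardize(maxima)

# Empirical versus limiting probability of a maximum below the batch mean
empirical = sum(1 for val in x if val <= 0.0) / len(x)
print(empirical, ftg_cdf(0.0))
```

Because both the data and the limit distribution are brought to zero mean and unit variance, no fitting parameter remains, which is the sense in which the comparison is parameter-free.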
5.2. Deviations from the Classical Theory 



In addition to the assumption of the $v_i$s being i.i.d. variables,
there are two additional problems with the program of comparing the
data with the theory. First, a notorious aspect of EVS is the slow
convergence of $P_N(x)$ to the limit distribution $P(x)$. Second, the
batch size $N$ (the number of galaxies in a given solid angle) varies
with the direction of the angle. Thus the histogram of the maximal
luminosities $P_N(x)$ is built from a distribution of $N$s. Both of
the above effects introduce corrections to the limit distribution we
are trying to use for comparison. Below we estimate the magnitude of
these corrections.

5.2.1. Finite size corrections 

Finite size corrections in EVS have been studied in detail with the
main conclusion that to first order in the vanishing correction in
the $N \to \infty$ limit, the scaling function can be written as

$$P_N(x) \approx P(x) + q(N)\,P_1(x) , \tag{10}$$

where $q(N \to \infty) \to 0$ and the shape correction $P_1(x)$ is a
universal function. Both the amplitude $q$ and the shape correction
$P_1(x)$ are known for Schechter type parent distributions. The
convergence to the limit distribution is slow since we have
(Györgyi et al. 2008, 2010)



$$q(N) = -\frac{\alpha}{\ln^2 N} \tag{11}$$



for $\beta = 1$. In the general case of a parent following the
generalized gamma distribution of Eq. (4), with $\beta \approx 1$,
there are two terms which may have comparable contributions (with
the shape correction function being identical)

$$q(N) = \frac{1-\beta}{\beta \ln N} + \frac{(2\beta-1)(\beta-\alpha-1)}{\beta^2 \ln^2 N} . \tag{12}$$



Note that this theoretical construct needs the values of $\alpha$
and $\beta$ to be fitted at the bright end tail of the luminosity
distribution, thus neglecting the low luminosity tail
(Györgyi et al. 2010). The value of $\beta$ is roughly 1, thus for a
characteristic range of $N \sim 10 - 200$, the amplitude is of the
order of 0.2-0.04. Thus one can expect deviations of 20-4% coming
from finite-size effects.
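The quoted range can be checked with a short calculation. The sketch below assumes that Eq. (12) reads $q(N) = (1-\beta)/(\beta \ln N) + (2\beta-1)(\beta-\alpha-1)/(\beta^2 \ln^2 N)$ and uses $\alpha = -1$, $\beta = 1$ as illustrative Schechter values; the function name `q_amplitude` is ours.

```python
import math

def q_amplitude(n, alpha=-1.0, beta=1.0):
    """First-order finite-size amplitude q(N), as read from Eq. (12)."""
    ln_n = math.log(n)
    leading = (1.0 - beta) / (beta * ln_n)
    subleading = ((2.0 * beta - 1.0) * (beta - alpha - 1.0)
                  / (beta ** 2 * ln_n ** 2))
    return leading + subleading

# Batch sizes spanning the characteristic range discussed in the text
for n in (10, 24, 50, 100, 200):
    print(f"N = {n:3d}  q(N) = {q_amplitude(n):.3f}")
```

With these assumed values the amplitude is about 0.19 at $N = 10$ and about 0.036 at $N = 200$, consistent with the 0.2-0.04 range above. For $\beta \neq 1$ the $1/\ln N$ term is switched on and can be comparable to the $1/\ln^2 N$ term.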

The finite-size shape correction is also known
(Györgyi et al. 2008):

$$P_1(x) = \left[P(x)\left(-\frac{a x^2}{2} + \frac{\zeta(3)}{a^2}\,x + \frac{a}{2}\right)\right]' \tag{13}$$



where $a = \pi/\sqrt{6}$ and $\zeta(z)$ is the Riemann zeta function.
The function $P_1(x)$ is plotted in Fig. 7 and one can see
that the first order correction has well defined signs in
various regions of $x$.
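A quick numerical check makes the sign structure of the correction plausible. The sketch assumes the shape correction has the form $P_1(x) = [P(x)(-a x^2/2 + \zeta(3)x/a^2 + a/2)]'$, which is our reading of Eq. (13); the helper names are illustrative. Since $P_N(x)$ and $P(x)$ are both normalized with zero mean and unit variance, the zeroth, first and second moments of $P_1$ should vanish.

```python
import math

A = math.pi / math.sqrt(6.0)
B = 0.5772156649015329       # Euler's constant
ZETA3 = 1.2020569031595942   # Riemann zeta(3)

def ftg_pdf(x):
    y = A * x + B
    return A * math.exp(-y - math.exp(-y))

def shape_correction(x):
    """P_1(x) = d/dx [P(x) g(x)] with the assumed polynomial g(x)."""
    y = A * x + B
    g = -0.5 * A * x * x + ZETA3 * x / A ** 2 + 0.5 * A
    dg = -A * x + ZETA3 / A ** 2
    dlogp = A * (math.exp(-y) - 1.0)   # P'(x) / P(x)
    return ftg_pdf(x) * (dlogp * g + dg)

def moment(k, lo=-10.0, hi=40.0, steps=100000):
    """Riemann-sum estimate of the k-th moment of P_1."""
    dx = (hi - lo) / steps
    return sum((lo + i * dx) ** k * shape_correction(lo + i * dx)
               for i in range(steps + 1)) * dx

for k in range(4):
    print(k, moment(k))
```

The first three moments vanish to numerical accuracy, so the correction can only move probability between regions of $x$, which is why it must change sign. The third moment is positive under this assumed form, meaning that a positive amplitude $q$ increases the skewness, as a small positive $\xi$ would.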

A special case arises when the parent distribution is of
Gumbel type. In this case, the EVD is also a Gumbel,
but with no apparent finite size correction. This is due to
the fact that the Gumbel distribution is a fixed point in
the renormalization theory formalism used for obtaining
the first order corrections (Györgyi et al. 2008, 2010). As
a result, the deviations should be caused only by random
shot noise from the data points.




Figure 7. First order finite size shape correction function in
finite-size scaling of EVS with the Schechter function being the
parent distribution. The amplitude of this correction is of the
order of $1/\ln N$ for $\beta \neq 1$ while it is of the order of
$1/\ln^2 N$ if $\beta = 1$.



5.2.2. Variable batch size 

Variable sample size raises basic questions about EVS. In
particular, the limiting procedure of sample size going to infinity
becomes a problem. If the normalized distribution $\mathcal{F}(N)$ of
$N$ is known, then it is natural to consider the average
$\bar{N} = \int \mathcal{F}(N)\,N\,dN$ as the parameter corresponding
to the fixed sample size of the usual EVS. Therefore, using the limit
$\bar{N} \to \infty$, the extreme value distribution becomes

$$\bar{P}(v) = \lim_{\bar{N} \to \infty} \int \mathcal{F}(N)\,P_N(v)\,dN . \tag{14}$$

Once $\bar{P}(v)$ is known, we can write it in scaled variables,
thus obtaining $\bar{P}(x)$, and the difference
$\bar{P}_1(x) = \bar{P}(x) - P(x)$ provides us an estimate of the
corrections coming from the variable sample size.

The actual calculation of $\bar{P}(x)$ assumes that we know
$\mathcal{F}(N)$. A simple form of $\mathcal{F}(N)$ which fits the
observed distribution reasonably well (see Fig. 6) and allows
analytic calculations is given by Eq. (15), where $k = 3$ and $d$ is
a free parameter distinct from zero, since there is a finite cut in
$N$ ($N > N_0$). Note that here we assumed that the distribution can
be written in a scaled form

$$\mathcal{F}(N) \approx f(N/\bar{N})/\bar{N} . \tag{16}$$

This is a good approximation to all of the experimental
distributions. Using the above $\mathcal{F}(N)$, one finds that the
limit distribution is universal within the FTG class (and so, for
the Schechter function parent distribution as well)

$$\bar{P}(v) \propto \frac{[k+1+d(1+e^{-v})]\exp(-d e^{-v} - v)}{(1+e^{-v})^{k+2}} . \tag{17}$$

The appropriately scaled distribution ($x$ variable) for the case of
$d = 0$ and $k = 3$ is given by the following expression

$$\bar{P}(x) = \frac{4a\exp(-ax-b)}{[1+\exp(-ax-b)]^{5}} , \tag{18}$$

where $a = \sqrt{\pi^2/3 - 49/36}$ and $b = 11/6$.

Figure 8. Comparison of the FTG limit distribution (red) with that
obtained from the variable sample size case with the sample-size
distribution given by Eq. (15) (green line, for the case of $d = 0$
and $k = 3$). The difference of the two functions is also shown,
being of the order of 10%.

The functions $\bar{P}(x)$ and $P(x)$ (FTG), and their difference,
are displayed in Fig. 8. We can see that the maximal difference
$\bar{P}_1(x) = \bar{P}(x) - P(x)$ is of the order of 10%. What is
more interesting is that the positive and negative regions of the
differences are significantly shifted compared to those of the
finite size corrections (Fig. 7). Thus the two corrections may
amplify as well as cancel each other, depending on the parent
distribution and on $\mathcal{F}(N)$.
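The size of the variable-batch correction can be estimated numerically. The sketch below assumes that Eq. (18) reads $\bar{P}(x) = 4a\,e^{-(ax+b)}/[1+e^{-(ax+b)}]^5$ with $a = \sqrt{\pi^2/3 - 49/36}$ and $b = 11/6$ (the case $d = 0$, $k = 3$), and compares it with the standardized FTG density of Eq. (8). Function and constant names are ours.

```python
import math

A_FTG = math.pi / math.sqrt(6.0)
B_FTG = 0.5772156649015329
A_VAR = math.sqrt(math.pi ** 2 / 3.0 - 49.0 / 36.0)
B_VAR = 11.0 / 6.0

def ftg_pdf(x):
    """Standardized FTG density, Eq. (8)."""
    y = A_FTG * x + B_FTG
    return A_FTG * math.exp(-y - math.exp(-y))

def variable_batch_pdf(x):
    """Scaled density assumed for Eq. (18), d = 0 and k = 3."""
    t = math.exp(-(A_VAR * x + B_VAR))
    return 4.0 * A_VAR * t / (1.0 + t) ** 5

grid = [-6.0 + 0.01 * i for i in range(1601)]   # x from -6 to 10
diff = [variable_batch_pdf(x) - ftg_pdf(x) for x in grid]
largest = max(diff, key=abs)
peak = max(ftg_pdf(x) for x in grid)
print(f"largest difference {largest:+.3f} "
      f"at x = {grid[diff.index(largest)]:+.2f}, FTG peak {peak:.3f}")
```

The ratio of the largest difference to the peak height of the FTG curve is the quantity to compare with the 10% quoted above.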



6. DISTRIBUTION OF MAXIMAL LUMINOSITIES AND THE EMPIRICAL FIRST ORDER CORRECTIONS



In order to compute statistics on the maximal luminosities, we used
the two sampling methods (random sampling and HEALPix-based batches)
explained above. As we deal with a fixed number $N_g$ of galaxies,
there is a bias-variance trade-off in all statistics calculated. In
fact, increasing the number of batches $n$ does indeed decrease the
variance. However, at the same time the number of data points per
batch $N$ decreases, which takes us away from the ideal case of
$N \to \infty$ and increases the bias. All the statistics are then
subject to the balance between $N$ and $n$.
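The trade-off between $N$ and $n$ can be made concrete with a sketch of the fixed-batch-size sampling. The helper `block_maxima` is a hypothetical name, the luminosities are a synthetic stand-in, and dropping the leftover points that do not fill a batch is our assumption.

```python
import random

def block_maxima(values, batch_size, rng):
    """Maxima of disjoint random batches of fixed size (no replacement)."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_batches = len(shuffled) // batch_size   # leftover points are dropped
    return [max(shuffled[i * batch_size:(i + 1) * batch_size])
            for i in range(n_batches)]

rng = random.Random(1)
# Synthetic stand-in for the N_g = 348975 MGS luminosities
luminosities = [rng.expovariate(1.0) for _ in range(348975)]
for batch_size in (24, 50, 100, 200):
    maxima = block_maxima(luminosities, batch_size, rng)
    print(batch_size, len(maxima))
```

With $N_g = 348975$ this gives $n = 14540$, 6979, 3489 and 1744 batches for $N = 24$, 50, 100 and 200, the same counts as listed for the MGS in Table 4: every doubling of $N$ halves the number of maxima available for the statistics.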

One interesting statistic we measured empirically is the tail index
$\xi$ of the extreme value distribution in Eq. (6), which is the
probability density distribution associated with Eq. (7). The
importance of $\xi$ is that it specifies whether the parent
distribution has a tail of infinite extent ($\xi \geq 0$, with a
power-law tail for $\xi > 0$) or a finite cut ($\xi < 0$) at a certain
maximum luminosity. Next, we also measured the first order finite
size correction for the random sampling method, as well as the
influence of the variable batch size in the HEALPix-based method.

6.1. Statistics from random sampling batches 

6.1.1. The Tail Index $\xi$

The tail index $\xi$ can be readily calculated in standard EVS using
the maximum likelihood estimator on Eq. (6), i.e., we find
numerically the values $L_\mu$, $L_\sigma$ and $\xi$ that maximize
$\ln \prod_{i=1}^{n} \mathrm{EVP}(L_i \mid L_\mu, L_\sigma, \xi)$ using
the Nelder-Mead algorithm (Press et al. 2007). We tried this for
various combinations of $n$ and $N$. The fitted parameters are in
Table 4, with probability distributions of the maximal luminosities
(in magnitude space) shown in Figure 9. Note that the maximal
luminosities are mostly sampled in the region where $w_i < 2$, which
assures us that we are sampling also from the luminosity function.
Note the good overall fit to the EVDs, as well as the increase of
$L_\mu$ as $N$ increases. The value of $L_\mu$ for the LRGs is about
twice the size of that for the MGS. As the number $n$ of batches
decreases with $N$, the dispersion and errors in the parameters also
increase at higher $N$, as expected.
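The objective function of such a fit can be sketched as follows. This is not the authors' code: it is a minimal negative log-likelihood for the density belonging to Eq. (7), with synthetic data drawn by inverse transform and parameter values that are only loosely inspired by the MGS rows of Table 4. A simplex minimizer, for example `scipy.optimize.minimize` with `method="Nelder-Mead"`, could then be applied to it.

```python
import math
import random

def gev_neg_log_likelihood(sample, loc, scale, xi):
    """-ln prod_i EVD(L_i | loc, scale, xi); math.inf outside the support."""
    if scale <= 0.0:
        return math.inf
    total = len(sample) * math.log(scale)
    for value in sample:
        z = (value - loc) / scale
        if abs(xi) < 1e-9:
            total += z + math.exp(-z)            # FTG limit
        else:
            t = 1.0 + xi * z
            if t <= 0.0:
                return math.inf                  # point outside the support
            total += (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return total

def draw_gev(rng, loc, scale, xi):
    """Inverse-transform sample from Eq. (7); xi must be nonzero here."""
    u = rng.random()
    return loc + scale * ((-math.log(u)) ** (-xi) - 1.0) / xi

rng = random.Random(2)
# Illustrative parameters (assumed): loc = 8, scale = 2, xi = 0.09
sample = [draw_gev(rng, 8.0, 2.0, 0.09) for _ in range(5000)]
for xi in (-0.2, 0.09, 0.4):
    print(xi, gev_neg_log_likelihood(sample, 8.0, 2.0, xi))
```

Returning infinity outside the support keeps a derivative-free minimizer away from parameter values for which some observed maximum would be impossible, which matters most for $\xi < 0$, where the distribution has an upper cutoff.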

For the MGS sample, the value of $\xi$ seems to be positive but very
close to zero, with $\xi < 0$ being unlikely. Note that for the MGS,
the tail index decreases with increasing $N$, so the deviation of
$\xi$ from zero may be just a finite size effect. In fact, we can
observe that $\xi \sim q(N)$, which is actually the theoretical
prediction if we assume a FTG EVD (Györgyi et al. 2010). The case of
the LRG is quite clear, where the value of the tail index does not
have a dependence on $N$, having $\xi \approx 0$ within the errors.

Fig. 9 also shows the averages of the maximum magnitude
($\langle M_r^{\max}\rangle$) as a function of the batch size $N$. As
one can see, the results for both the MGS and LRG samples are well
fitted by the theoretical large-$N$ asymptote $a \ln(\ln N) + b$,
which follows from the EVS of an exponential parent distribution for
the luminosities. The test of the theory, however, is not very
stringent since $N$ varies over less than one and a half decades.

Figure 9. Left: Distribution of r-band absolute magnitude $M_r$ of the full samples. Also included are the distributions of the maximal
luminosity for each sample according to the fixed batch size sampling. The vertical lines show the points where $w_i = 1$ (left line) and
$w_i = 2$ (right line). Right: Mean of the maximal luminosity distributions (in absolute magnitude space) versus batch size $N$ using the fixed
batch size sampling. The dashed lines are fits to the asymptotes $\langle M_r^{\max}\rangle \sim \ln\ln N$ following from the EVS theory for i.i.d. variables with
exponentially decaying parent distributions.
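The $\ln\ln N$ asymptote can be illustrated with a small least-squares fit. For an exponentially decaying parent the typical maximum grows as $L \sim L_* \ln N$, so in magnitudes $M = -2.5\log_{10} L + \mathrm{const}$ behaves as $a \ln(\ln N) + b$ with $a = -2.5/\ln 10$. The sketch below is only a rough check under stated assumptions: it uses the MGS location parameters $L_\mu$ from Table 4 as a proxy for the mean maximal luminosity, which is not the quantity fitted in Fig. 9, and it ignores the additive magnitude zero point.

```python
import math

# (N, L_mu in units of 10^10 solar luminosities) for the MGS, from Table 4
mgs = [(24, 7.99), (50, 9.48), (100, 10.96), (200, 12.59)]

xs = [math.log(math.log(n)) for n, _ in mgs]
# Magnitudes up to an additive constant: M = -2.5 log10(L) + const
ys = [-2.5 * math.log10(l_mu) for _, l_mu in mgs]

x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = y_mean - slope * x_mean

print(f"fitted a = {slope:.2f}, b = {intercept:.2f}")
print(f"pure exponential tail: a = {-2.5 / math.log(10):.2f}")
```

The fitted slope is negative and of order unity, close to the value expected for a purely exponential tail. With only four batch sizes the check is weak, in line with the remark that $N$ spans less than one and a half decades.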



Table 4

Fitting parameters for maximal luminosities^a

Sample            N     n       $L_\mu$ [$10^{10} L_\odot$]   $L_\sigma$ [$10^{10} L_\odot$]   $\xi$

MGS               24    14540   7.99 ± 0.03                   1.98 ± 0.03                      0.086 ± 0.011
($N_g$ = 348975)  50    6979    9.48 ± 0.05                   2.10 ± 0.04                      0.091 ± 0.017
                  100   3489    10.96 ± 0.08                  2.26 ± 0.06                      0.089 ± 0.023
                  200   1744    12.59 ± 0.12                  2.43 ± 0.09                      0.079 ± 0.033

LRG               24    2190    17.91 ± 0.12                  2.72 ± 0.09                      0.002 ± 0.028
($N_g$ = 52579)   50    1051    19.86 ± 0.18                  2.81 ± 0.13                      -0.016 ± 0.038
                  100   525     21.75 ± 0.26                  2.80 ± 0.18                      -0.008 ± 0.054
                  200   262     23.69 ± 0.36                  2.77 ± 0.25                      -0.010 ± 0.075

^a Parameters from the maximum likelihood fitting of the
extreme value distribution in Eq. (6). Maximal luminosity
values are sampled at $n$ batches of fixed size $N$. Quoted
are the 1-$\sigma$ standard errors. $N_g$ denotes the total number of
galaxies in each sample.

6.1.2. The First Order Finite Size Correction

Motivated by the presence of a finite $N$, we analyzed the behavior
of the empirical finite size corrections for the EVD, and plotted
them in Fig. 10. Since the estimated values of $\xi$ in the previous
section are zero or a small positive number, which is difficult to
specify precisely, we assumed for simplicity $\xi = 0$ and used the
theoretical corrections in Section 5.2.1 (theoretical corrections
for $\xi \neq 0$ are not developed yet). Here, the empirical
corrections are obtained by standardizing the maximal luminosities
and taking the difference between their histogram and the standard
Gumbel distribution. For plotting the theoretical corrections with
an amplitude given in Eq. (12), we need appropriate values of
$\alpha$ and $\beta$. As explained in Section 5.2.1, these fitting
parameters should come from fitting the high luminosity tail.
Therefore, the fitted values of the full LD in Table 2 should not be
used. In our case, Fig. 3 shows that the full luminosity function
fit (Table 1) is a much better approximation of the high luminosity
tail, and we use it instead.

Fig. 10 shows that the empirical corrections for MGS galaxies do
have the same shape as the theoretical first order correction. The
amplitude of the function approximately agrees, but we found that
the empirical amplitude does not increase significantly when $N$
decreases. The explanation is that as $N$ becomes smaller, we start
sampling the maximal luminosities from the bulk of the luminosity
distribution instead of its high luminosity tail. Of course, the
departure of LF and LD in this regime makes the LF fitting
parameters $\alpha$ and $\beta$ no longer valid for calculating the
theoretical corrections. A better fit could be attained from
$\alpha$ and $\beta$ parameters obtained by fitting the LD to
magnitudes slightly fainter than the departing magnitude $M_0$. The
other consideration is the fact that we might need to add the next
term in the correction, which could be important if $N$ is small.

Figure 10. Empirical finite size corrections from $n$ batches of
fixed size $N$ (from Table 4). Each simulation is made by random
sampling (without replacement) of a fixed number $N$ of luminosity
data points in order to mimic the block maxima approach in EVS.
Random sampling with replacement provides similar results. Also,
the theoretical first order correction in Eq. (10) is shown for the
cases $N$ = 24, 50, 100, 200 (with decreasing amplitude).

In the LRG case, we cannot find a systematic correction, but just
noise. This is in agreement with the expected behavior as explained
in Section 5.2.1, since the Gumbel parent distribution is a fixed
point in the renormalization theory used for calculating the
corrections (Györgyi et al. 2008, 2010).

6.2. Statistics from HEALPix-based batches

The distributions of the maximal luminosities (in magnitude space)
for the HEALPix-based method are shown in Fig. 11. Here, the low
luminosity tails of the maximal distributions reach farther into the
low luminosity regions (around $M_0$) than in the random sampling
method. The reason is that some of the HEALPix cells have very low
values of $N$. The maximal luminosities, however, are mostly sampled
in the region where $w_i < 2$. As this is the high luminosity region
where the LD and LF mostly coincide, all the results obtained from
analyzing the bright end of the LD can also be extended to and
associated with the LF.



[Figure 11: two panels, MGS and LRG; horizontal axis $M_r$. Legend (MGS): magnitude parent distribution; Max. Lum., $N_{side}$ = 16, N = 567, n = 2838; Max. Lum., $N_{side}$ = 32, N = 142, n = 1218(; Max. Lum., $N_{side}$ = 64, N = 36, n = 50952. Legend (LRG): magnitude parent distribution; Max. Lum., $N_{side}$ = 16, N = 78, n = 473; Max. Lum., $N_{side}$ = 32, N = 20, n = 2029; Max. Lum., $N_{side}$ = 64, N = 5, n = 8256.]

Figure 11. Parent distribution of r-band absolute magnitude $M_r$
for the full samples. Also included are the distributions of the
maximal luminosity for each sample according to different HEALPix
resolutions. The vertical lines show the points where $w_i = 1$ (left
line) and $w_i = 2$ (right line).

Fig. 12 presents the distributions of maximal luminosities (in
luminosity space) observed in a HEALPix cell for the three studied
resolutions ($N_{side}$ = 16, 32, 64) and for the two galaxy
populations. The distributions in this figure are scaled to zero
mean and unit deviation in order to compare them with the similarly
scaled Gumbel distribution. The theoretical first order correction
coming from the finite-size (finite $N$) effects, together with the
correction to the i.i.d. limit distribution coming from the
distribution $\mathcal{F}(N)$ of the galaxy counts, are also shown in
Fig. 12. Except for the cases of $N_{side}$ = 16, where the
statistical noise has a larger amplitude than the corrections, it
appears that the sum of these two corrections is of the order of the
residuals and has the same functional shape. At the highest
resolution of the maps ($N_{side}$ = 64), however, the batch sizes
are rather small and the finite-size corrections become large, and
appear to be in accord with the theoretical predictions.

The LRGs are special in the sense that finite size effects do not
emerge in their EVS, because the parent distribution itself is
Gumbel, as explained in Section 5.2.1. Consequently the only
correction to the limit distribution comes from the variable batch
size $N$, in agreement with what we see in the fourth column in
Fig. 12.

We also carried out simulations in order to confirm our theoretical
results by modeling the empirical situation with less noise.
Sampling the fitted functions of the empirical parent and sample
size distributions with high statistics ($n = 10^{\ldots}$) we get
smoother histograms. As can be seen in Fig. 12, the simulated
corrections are indeed a good model for the empirical corrections,
and support the theoretical expectations, as we can observe the
convergence toward the theoretical curve for increasing $N$.
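A stripped-down version of such a simulation is sketched below. It is not the authors' setup: the parent is a unit exponential, the batch size is assumed to follow a gamma distribution with shape $k + 1 = 4$ (the form that leads to Eq. (18) for $d = 0$, $k = 3$ under i.i.d. sampling), and the mean batch size of 500 is an arbitrary illustrative value. The fraction of maxima below their mean is then compared with the values implied by Eq. (18) and by the FTG limit of Eq. (8).

```python
import math
import random
import statistics

K = 3                      # exponent k of the assumed batch-size distribution
MEAN_N = 500.0             # illustrative average batch size
B_VAR = 11.0 / 6.0         # b of Eq. (18)
B_FTG = 0.5772156649015329 # b of Eq. (8)

def max_of_exponentials(rng, n):
    """Maximum of n unit exponentials, drawn by inverting (1 - e^{-v})^n."""
    u = rng.random()
    return -math.log(1.0 - u ** (1.0 / n))

rng = random.Random(3)
maxima = []
for _ in range(100000):
    # Gamma-distributed batch size with shape k + 1 and mean MEAN_N (assumed)
    n = max(1, round(rng.gammavariate(K + 1, MEAN_N / (K + 1))))
    maxima.append(max_of_exponentials(rng, n))

mean = statistics.fmean(maxima)
below_mean = sum(1 for v in maxima if v <= mean) / len(maxima)

variable_cdf_0 = (1.0 + math.exp(-B_VAR)) ** -(K + 1)   # integrated Eq. (18) at x = 0
ftg_cdf_0 = math.exp(-math.exp(-B_FTG))                 # integrated Eq. (8) at x = 0
print(below_mean, variable_cdf_0, ftg_cdf_0)
```

The two reference values are about 0.553 (variable batch size) and 0.570 (FTG), so a sample of this size is large enough to tell the two limit distributions apart.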

7. DISCUSSION AND CONCLUSION 

Studying extreme statistics may have several outcomes. One may
discover that the objects under consideration have a well defined
i.i.d. type extreme value distribution. This may then lead to the
conclusion (provided the distribution is of the Weibull type, i.e.,
the shape parameter is negative) that the underlying objects have an
intrinsic cutoff in size. In our case the luminosities have an
i.i.d. EVS, but the shape parameter $\xi$ is in the positive range
and very close to zero. Thus our conclusion here is that the MGS and
LRG luminosities do not have an upper cutoff.

Figure 12. The normalized maximum luminosity histograms
(black circles) for $N_{side}$ = 16, 32, 64 (from top to bottom) for
the four galaxy samples compared to the limit distribution FTG
(solid red line) in scaled variables ($\langle x\rangle = 0$ and
$\sigma = 1$), while blue crosses are the residuals to the FTG. For
the MGS, the solid magenta curves show
$q(N)P_1(x) + \bar{P}_1(x)$, i.e., the first order finite size
correction for the Schechter parent added to the variable batch size
correction calculated for Eq. (15). The LRG curve is different, in
the sense that the parent is FTG and the finite size corrections do
not appear, having corrections only due to the variable batch size
($\bar{P}_1(x)$). The black solid curves are the simulations that
result from using the experimentally given luminosity distributions
and sample size distributions.

As far as the LRGs are concerned, we should note that the same
conclusion about the absence of an upper cutoff can be reached by a
straightforward fit to the high end of the luminosity function. On
the other hand, there are difficulties with the agreement of the
Schechter fitting to the bright end of the MGS LF (e.g.
Madgwick et al.; Bell et al.; Blanton et al. 2003; Smith et al.
2009). Here the EVS analysis suggests that the root of the problem
may be a small positive tail parameter $\xi$. This possibility was
also noted by Alcaniz & Lima (2004) with their proposal of the
generalized double power law fitting function for the LF.

Of course, the conclusion that MGS galaxies do not 
have a finite luminosity cutoff and $\xi$ has a small positive
value is valid only if the methods used in the study are 
robust against possible corrections arising in the analysis. 
Uncertainties may come from the finite size of the sample, 
from the distribution of the number of objects in the 
sample, and from the correlations among the objects. 
We have taken care of the finite-size effects by including 
the first order corrections in the limit distributions and, 
furthermore, we handled the fluctuations in the sample 
size by explicitly calculating their effect for the i.i.d. case. 

As evidenced by Figs. 10 and 12, a parameter-free comparison with
the data suggests good agreement, with the corrections (for the case
$\xi = 0$) being of the right order of magnitude as well as of the
right shape. Thus, we believe that the above effects are in
agreement with the conclusion about the absence of an upper cutoff
in the luminosity. Of course, the agreement proved to be valid when
the fitting parameters come from a parent distribution fitted in the
high luminosity tail, as expected from the theory. If the batch size
$N$ decreases and the peak of the maximal luminosities moves into the
lower luminosity region, agreement with theory should be obtained
only if the fitting is performed in an extended luminosity interval
that includes the lower luminosity values. Since the LF is basically
constructed from a weighted LD, the strategy of sampling the maxima
from the bright luminosity tail of the LD (where the LF and LD
coincide) was a key part of our analysis. Thus it would be of
interest for future studies to develop an extended extreme value
theory, where all the data points coming from a given class of
parent distributions are each counted with different weights. Such a
theory may help in analyzing data sets where there is incompleteness
even at the tail where the extremes are sampled from.

The correlations pose a more difficult problem. For one dimensional
systems, it is known from the studies of $1/f^{\alpha}$ type signals
that the correlations are irrelevant if they are "weak"
(Györgyi et al. 2007). Weak means that the integral of the
correlation function is finite. The effective one-dimensionality of
the pencil beam geometry considered in this paper allows the
application of the weakness criterion to the luminosity
correlations. Indeed, one may argue that the luminosity correlations
$C_L(r)$ are proportional to the density correlations $C_\rho(r)$
which, at large distances, decay as
$C_L(r) \sim C_\rho(r) \sim 1/r^2$. The one-dimensional integral of
this type of correlations is convergent, thus we believe the
weakness criterion is satisfied, and our conclusion is not affected
by the correlations [note that any power relationship between the
correlations ($C_L(r) \sim C_\rho^{\mu}(r)$) will also satisfy the
criterion of weakness provided $\mu > 1/2$].
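The condition in brackets follows from a one-line integral. As a short worked step (the lower cutoff $r_0 > 0$ is introduced here only to isolate the large-distance behavior and is not a quantity from the paper):

```latex
\int_{r_0}^{\infty} C_L(r)\,dr
  \;\sim\; \int_{r_0}^{\infty} r^{-2\mu}\,dr
  \;=\; \frac{r_0^{\,1-2\mu}}{2\mu-1}
  \;<\; \infty
  \quad\Longleftrightarrow\quad
  \mu > \tfrac{1}{2} .
```

The case $\mu = 1$ corresponds to $C_L(r) \sim 1/r^2$, for which the integral equals $1/r_0$. For $\mu \leq 1/2$ the integral diverges at large $r$ and the correlations would no longer count as weak.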

We can thus conclude that the extreme value statistics of galaxy
luminosities is of i.i.d. type with zero or small positive shape
parameter, and this conclusion takes into account the finite size of
the samples, the galaxy-number fluctuations in the pencil beams, and
the large-distance spatial correlations among luminosities.